35 research outputs found

Identifying Similar Meaning In Word Sequences

Author: Uszkoreit Jakob
Publication venue: Technical Disclosure Commons
Publication date: 24/04/2017

A system for determining the semantic similarity of phrases (word sequences) uses weak supervision of neural networks to generate an embedding space in which phrase similarity can be measured. The neural networks are trained on input from two orthogonal sources: the click distribution of queries and phrase translations.

Technical Disclosure Commons

Hybrid robust deep and shallow semantic processing for creativity support in document production

Author: Callmeier Ulrich
Eisele Andreas
Schäfer Ulrich
Siegel Melanie
Uszkoreit Hans
Uszkoreit Jakob
Publication date: 01/01/2004

The research performed in the DeepThought project (http://www.project-deepthought.net) aims at demonstrating the potential of deep linguistic processing when added to existing shallow methods that ensure robustness. Classical information retrieval is extended by high-precision concept indexing and relation detection. We use this approach to demonstrate the feasibility of three ambitious applications, one of which is a tool for creativity support in document production and collective brainstorming. This application is described in detail in this paper. Common to all three applications, and the basis for their development, is a platform for integrated linguistic processing. This platform is based on a generic software architecture that combines multiple NLP components and on robust minimal recursive semantics (RMRS) as a uniform representation language.

Hochschulschriftenserver - Universität Frankfurt am Main

Cross-lingual Word Clusters for Direct Transfer of Linguistic Structure

Author: McDonald Ryan
Täckström Oscar
Uszkoreit Jakob
Publication date: 01/01/2012

It has been established that incorporating word cluster features derived from large unlabeled corpora can significantly improve prediction of linguistic structure. While previous work has focused primarily on English, we extend these results to other languages along two dimensions. First, we show that these results hold true for a number of languages across families. Second, and more interestingly, we provide an algorithm for inducing cross-lingual clusters and we show that features derived from these clusters significantly improve the accuracy of cross-lingual structure prediction. Specifically, we show that by augmenting direct-transfer systems with cross-lingual cluster features, the relative error of delexicalized dependency parsers, trained on English treebanks and transferred to foreign languages, can be reduced by up to 13%. When applying the same method to direct transfer of named-entity recognizers, we observe relative improvements of up to 26%.

CiteSeerX

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

Neural Paraphrase Identification of Questions with Noisy Pretraining

Author: Das Dipanjan
Duque Thyago
Tomar Gaurav Singh
Täckström Oscar
Uszkoreit Jakob
Publication date: 01/01/2017

We present a solution to the problem of paraphrase identification of questions. We focus on a recent dataset of question pairs annotated with binary paraphrase labels and show that a variant of the decomposable attention model (Parikh et al., 2016) performs accurately on this task while being far simpler than many competing neural architectures. Furthermore, when the model is pretrained on a noisy dataset of automatically collected question paraphrases, it obtains the best reported performance on the dataset.

arXiv.org e-Print Archive

Crossref